Near fine grain parallel processing using a multiprocessor with MAPLE
Authors
Abstract
Multi-grain parallelization is an effective scheme that exploits parallelism at several levels of a sequential program: coarse-grain (macro-dataflow), medium-grain (loop-level parallelization), and near-fine-grain (statement-level parallelization). The multiprocessor ASCA is designed for efficient execution of multi-grain parallelized programs. Its processing element, called MAPLE, is designed mainly for near-fine-grain parallelism and has two modules, the MAPLE core and the DTC. The MAPLE core is a simple RISC processor that executes every operation in a fixed time and supports direct register-to-register transfer. The DTC implements a software-controlled cache through instructions generated by the compiler. With static scheduling, near-fine-grain parallel processing is performed efficiently using a communication mechanism with receive registers and a non-synchronization operation mechanism. Implementation of a prototype chip and clock-level simulation indicate that the performance of a single-chip multiprocessor with four MAPLEs is close to that of modern superscalar processors, despite its small hardware and low clock frequency.
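The static scheduling idea in the abstract can be illustrated with a minimal sketch. It assumes a simple greedy list scheduler, which is not necessarily the algorithm used by the authors' compiler. The task names, latencies, and the `TRANSFER_COST` constant are hypothetical values chosen for illustration. The point it shows is that when every operation takes a fixed, known time, a compiler can compute exact start cycles for statement-level tasks ahead of execution.

```python
# Sketch only: task names, latencies, and TRANSFER_COST are hypothetical,
# not values taken from the paper.
TRANSFER_COST = 1  # assumed cycles for a cross-PE register-to-register transfer


def static_schedule(tasks, deps, num_pes=4):
    """Greedy list scheduling of statement-level tasks onto processing elements.

    tasks: {name: fixed latency in cycles}
    deps:  {name: [predecessor names]}
    Returns {name: (pe, start_cycle, finish_cycle)}.
    """
    schedule = {}
    pe_free = [0] * num_pes  # cycle at which each PE becomes idle
    remaining = dict(tasks)
    while remaining:
        # Tasks whose predecessors are all scheduled; sorted for determinism.
        ready = sorted(
            t for t in remaining
            if all(p in schedule for p in deps.get(t, []))
        )
        if not ready:
            raise ValueError("dependency cycle in task graph")
        task = ready[0]
        best = None
        for pe in range(num_pes):
            # Operands produced on another PE arrive TRANSFER_COST cycles later.
            data_ready = 0
            for p in deps.get(task, []):
                p_pe, _, p_finish = schedule[p]
                arrival = p_finish + (0 if p_pe == pe else TRANSFER_COST)
                data_ready = max(data_ready, arrival)
            start = max(pe_free[pe], data_ready)
            if best is None or start < best[1]:
                best = (pe, start)
        pe, start = best
        finish = start + remaining.pop(task)
        pe_free[pe] = finish
        schedule[task] = (pe, start, finish)
    return schedule


# Hypothetical four-statement task graph: c needs a and b, d needs c.
tasks = {"a": 1, "b": 1, "c": 2, "d": 1}
deps = {"c": ["a", "b"], "d": ["c"]}
result = static_schedule(tasks, deps)
```

In this sketch every start cycle is fixed before execution, so the consumer of a value knows the cycle at which the value reaches it, which is the property that makes running without run-time synchronization plausible.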
Similar resources
Near Fine Grain Parallel Processing Using Static Scheduling on Single Chip Multiprocessors
With the increasing number of transistors integrated on a chip, efficient use of transistors and scalable improvement of a processor's effective performance are becoming important problems. However, it has been thought that popular superscalar and VLIW architectures would have difficulty obtaining scalable improvement of effective performance in the future because of the limitation of instruction level para...
Evaluation of Single Chip Multiprocessor Core Architecture with Near Fine Grain Parallel Processing
Coarse-Grain Task Parallel Processing Using the OpenMP Backend of the OSCAR Multigrain Parallelizing Compiler
This paper describes automatic coarse grain parallel processing on a shared memory multiprocessor system using a newly developed OpenMP backend of the OSCAR multigrain parallelizing compiler, which targets systems ranging from a single chip multiprocessor to a high performance multiprocessor and a heterogeneous supercomputer cluster. The OSCAR multigrain parallelizing compiler exploits coarse grain task parallelism and near fine gra...
Cache Optimization for Coarse Grain Task Parallel Processing Using Inter-Array Padding
The wide use of multiprocessor systems has been making automatic parallelizing compilers more important. To further improve multiprocessor system performance through the compiler, multigrain parallelization is important. In multigrain parallelization, coarse grain task parallelism among loops and subroutines and near fine grain parallelism among statements are used in addition to the traditional loop...
Experience with Fine-Grain Communication in EM-X Multiprocessor for Parallel Sparse Matrix Computation
Sparse matrix problems require a communication paradigm different from those used in conventional distributed-memory multiprocessors. We present in this paper how fine-grain communication can help obtain high performance in the experimental distributed-memory multiprocessor, EM-X, developed at ETL, which can handle fine-grain communication very efficiently. The sparse matrix kernel, Conjugate G...
Publication date: 2003